Hysplit concentration read speedup #177
base: develop
Tested against output of hysplit.v5.2.2
OK, this is more complicated than I thought; this version does not work for some output I have.
This now handles empty columns; however, there might be more cases to consider that I do not know of.
I ran into another edge case and fixed that. By moving some of the logic out of the inner loop, I got another factor of ~2 speedup.
Thanks! The reader could use some improvements, and I appreciate this work on it. I will have time to review it and pull it into the hysplit development branch sometime in the beginning or middle of August.
For my use case of reading relatively large hysplit concentration grids, I found `readfile` to be much slower than the hysplit simulation itself. Looking at the code, 93% of the time was spent on iterative calls to `xr.merge`.
I had to debug a bit, because it seems that, at least for my output of hysplit v5.2.2, the species and levels were flipped in order.
Building a list of lists and calling xr.merge and xr.concat yielded a significant speedup of roughly 15x, very worthwhile for me.
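As a rough illustration of the list-of-lists pattern described above (not the actual reader code), the sketch below collects one small grid per species and level, concatenates each species along the level dimension once, and then merges the species in a single call. The helper `make_grid`, the species names, the level values, and the dimension names `x`, `y`, `z` are all hypothetical.

```python
import numpy as np
import xarray as xr


def make_grid(species, level, shape=(4, 5)):
    """Build a small hypothetical concentration grid for one species and level."""
    data = np.full(shape, float(level))
    da = xr.DataArray(
        data,
        dims=("y", "x"),
        coords={"y": np.arange(shape[0]), "x": np.arange(shape[1])},
        name=species,
    )
    # Add a length-1 level dimension so grids can be concatenated along it.
    return da.expand_dims(z=[level])


species_names = ["SP01", "SP02"]  # hypothetical species identifiers
levels = [100, 500, 1000]         # hypothetical level heights

# Outer list: one entry per species; inner list: one grid per level.
per_species = [[make_grid(sp, lev) for lev in levels] for sp in species_names]

# One concat per species along the level dimension, then a single merge,
# instead of calling xr.merge once per (species, level) grid.
stacked = [xr.concat(grids, dim="z") for grids in per_species]
dset = xr.merge(stacked)
```

The point of the design is that each `xr.concat` and the final `xr.merge` see all of their inputs at once, so the combined dataset is not rebuilt on every iteration.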
The tests pass, but they do not actually cover this part of the code.
BTW: I think a further speedup would be possible by lifting the conversion to a pandas DataFrame out of the innermost function.
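One way the suggested change could look, as a minimal sketch rather than the reader's real code: the inner function returns plain NumPy arrays, and a single DataFrame is built after the loop. The function `parse_record` and the column names `z`, `index`, and `conc` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def parse_record(level, values):
    """Hypothetical inner parser: returns plain arrays, no DataFrame."""
    idx = np.arange(values.size)
    return np.full(values.size, level), idx, values


# Collect raw arrays for each record inside the loop.
records = [parse_record(lev, np.array([1.0, 2.0]) * lev) for lev in (100, 500)]

# Build the DataFrame once, after the loop, from the concatenated columns.
level_col, index_col, conc_col = (np.concatenate(cols) for cols in zip(*records))
frame = pd.DataFrame({"z": level_col, "index": index_col, "conc": conc_col})
```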